Abstract

We present a general method for de-amortizing essentially any Binary Search Tree (BST) 
algorithm. In particular, by transforming Splay Trees, our method produces a BST that has 
the same asymptotic cost as Splay Trees on any access sequence while performing each search 
in O(log n) worst case time. By transforming Multi-Splay Trees, we obtain a BST that is
O(log log n) competitive, satisfies the scanning theorem, the static optimality theorem, the static
finger theorem, the working set theorem, and performs each search in O(logn) worst case time. 
Moreover, we prove that if there is a dynamically optimal BST algorithm, then there is a 
dynamically optimal BST algorithm that answers every search in O(logn) worst case time. 
1 Introduction



Over half a century since the discovery of rotation-based Binary Search Trees, their exact
performance is still not fully understood. The very first works on BSTs focused on keeping the
tree balanced (O(log n) height and search time) after performing insertions and deletions [Hill],
or guaranteeing better average case bounds for searches with known distributions [22].

By introducing splay trees [13], Sleator and Tarjan proposed an alternate view of the problem, 
where instead of looking at the cost of individual searches, it is the entire cost of a sequence of 
accesses which is bounded, using amortized analysis. 

The purpose of this article is to show that the two approaches are not exclusive — i.e., that 
it is possible to combine the good amortized performances of self-adjusting and other adaptive 
BST with strong worst case guarantees for individual searches. 

The BST Model. In order to describe our results accurately, we choose one BST model
among several existing standard variants, most of which are asymptotically equivalent. In line 
with previous work, we will not consider insertions and deletions. Hence, our BST model consists 
of a binary search tree T containing the n distinct keys {1, 2, . . . , n} with their natural order. 
The position of a finger, initially at the root of T, is maintained, and the following two BST 
operations, each of unit cost, are allowed: 1) moving the finger from a node to its parent or to 
one of its children, and 2) performing a rotation between the node pointed to by the finger and 
its parent. 
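
As an illustration of the model, here is a minimal Python sketch of a tree with a finger and a unit-cost counter. The names `Node`, `FingerBST`, `build`, and `inorder` are hypothetical and chosen only for this example; the sketch assumes parent pointers are available.

```python
class Node:
    """A BST node with a parent pointer so the finger can move upward."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def build(lo, hi, parent=None):
    """Build a balanced BST on the keys lo..hi (illustrative helper)."""
    if lo > hi:
        return None
    node = Node((lo + hi) // 2)
    node.parent = parent
    node.left = build(lo, node.key - 1, node)
    node.right = build(node.key + 1, hi, node)
    return node


def inorder(node):
    """Return the keys of the subtree rooted at node in symmetric order."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


class FingerBST:
    """The two unit-cost operations of the model: finger moves and rotations."""
    def __init__(self, root):
        self.root = self.finger = root   # the finger starts at the root
        self.cost = 0                    # number of BST operations so far

    def move(self, direction):
        """Move the finger to 'parent', 'left', or 'right' (cost 1)."""
        if direction not in ("parent", "left", "right"):
            raise ValueError("unknown direction")
        target = getattr(self.finger, direction)
        if target is None:
            raise ValueError("no node in that direction")
        self.finger = target
        self.cost += 1

    def rotate(self):
        """Rotate the finger node with its parent (cost 1)."""
        x, p = self.finger, self.finger.parent
        if p is None:
            raise ValueError("the root has no parent to rotate with")
        g = p.parent
        if p.left is x:                  # right rotation
            p.left = x.right
            if x.right is not None:
                x.right.parent = p
            x.right = p
        else:                            # left rotation
            p.right = x.left
            if x.left is not None:
                x.left.parent = p
            x.left = p
        p.parent, x.parent = x, g
        if g is None:
            self.root = x
        elif g.left is p:
            g.left = x
        else:
            g.right = x
        self.cost += 1
```

For example, on the balanced tree over keys 1..7, `move("left")` followed by `rotate()` brings key 2 to the root at a total cost of 2 while leaving the symmetric order unchanged.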

Given the current tree T and the current finger position, an access to a key x is a list of BST
operations (finger movements and rotations), during which the finger position is at the node
containing x at least once.

For an input sequence S = (s_1, s_2, ..., s_m) of keys to be accessed, a BST algorithm A that
realizes S returns a list A(S) of BST operations for accessing the keys s_1, s_2, ... in that order,
that is, where S is a subsequence of the sequence of keys pointed to by the finger during the
execution of A(S). An offline algorithm A is given the entire sequence S and the starting tree
T as input and then outputs the sequence of operations A(S), while an online algorithm is fed
the keys from S one by one and must output the BST operations for the access of one key before
the next key is given. More formally, A is online if A(S) is a prefix of A(S') whenever S is a
prefix of S'. The cost of A(S) is the number of BST operations it contains.

Note that the model, as all the standard variants of the BST model used in competitive
analysis of online BST algorithms, only requires the algorithm to list the BST operations A(S)
to be performed (see, e.g., [1S|). In particular, the model does not restrict how those operations
are generated, what auxiliary memory is used in order to generate them, or even how much
time is used to generate them.

Of course, real-world implementations of practical BST algorithms have some sensible limits
on their time and space usage. In fact, almost all BST implementations in the literature, besides
adhering to the standard BST model described above, also have the following additional features:
they work in the pointer machine model, use no more space than the tree itself plus O(1) words
of balance information in each node of the tree and O(1) extra working variables, and generate
their access sequence A(S) in time proportional to the BST model cost of A(S). The majority
of this paper is devoted to showing how to de-amortize BST algorithms, with a method working
in the standard BST model. As a final step, we show how to extend the method to maintain
the additional features just listed, should the BST algorithm being de-amortized have these.

Denote by OPT the best offline algorithm, that is, OPT(S) is a shortest possible list of
operations that realizes S. An algorithm A (online or offline) is f(n)-competitive if we have
A(S) = O(f(n) · OPT(S)) for all sequences S. It is dynamically optimal if it is O(1)-competitive.

Prior works. The study of self-adjusting BSTs to minimize the overall cost over a sequence
of accesses was initiated by Allen and Munro [2] with their analysis of the move-to-root and
the simple exchange heuristics, and then by Sleator and Tarjan with the introduction of Splay
trees [23], which they conjectured to be dynamically optimal. They show how the running
time of Splay trees can be upper bounded in several ways as a function of the access sequence.
They prove the balance theorem (accesses run in O(log n) amortized), the static optimality
theorem (any sequence of accesses runs within a constant factor of the time to run it on the
best possible static tree for that sequence; in particular it reaches the entropy bound), the static
finger theorem (access x runs in O(log d(x, f)), where d(x, f) is the number of keys between the
query item x and any fixed finger element f), the working set theorem (access x runs in time
O(log w(x)) where w(x) is the number of distinct elements accessed since the previous access to
x), and the scanning theorem (accessing all nodes in symmetric order takes time O(n)). They
also conjectured the dynamic finger theorem (access to y runs in amortized O(log d(x, y)) where
x is the previous item in the access sequence), which was subsequently proved by Cole [HI [5].
All bounds above are amortized.

On another front, Wilber _2S] gave a formal analysis of several variants of the BST model,
providing equivalence reductions between them, and provided two lower bounds on the number
of operations that any BST algorithm must perform for a given sequence. In particular, he
proved that the bit reversal sequence requires Ω(log n) amortized operations per access. These
lower bounds were recently generalized in [13] [10]. Splay trees were also shown to be key
independent optimal [19], that is, they are O(1)-competitive if the order of the keys is arbitrary
or random, and that they are O(1)-competitive with respect to a wide class of balanced BST
algorithms [15].

New bounds have been designed: the queueish bound (opposite of the working set bound:
the number of elements not accessed since the last access to x) was shown not to be achievable
by any BST algorithm [50]. Recent papers have attempted to engineer a BST that satisfies
the unified property, a bound that implies both the dynamic finger and the working set bound
[18] [5]. The skip-splay trees [12] perform each access within a multiplicative factor O(log log n)
of the unified bound, amortized. The layered working set trees [7] are BSTs that achieve the
working set bound worst case. By combining it with the skip-splay structure, the authors show
how to achieve the unified bound, amortized, with an additive cost of O(log log n).

The first significant breakthrough on the competitive analysis of BST algorithms came with
the invention of tango trees [11], the first provably O(log log n)-competitive BST. This result was
subsequently improved independently by the multi-splay trees [24] and the chain-splay trees [16],
which both offer the additional guarantee of performing each access in O(log n) amortized time.
Further properties of multi-splay trees were proved in [14], where they were shown to satisfy
static optimality, the static finger property, the working set property, and key-independent
optimality. They further satisfy the dequeue property which is not known to be satisfied by
splay trees.

In recent years, the question was raised as to whether the good amortized properties could
be reconciled with the O(log n) worst case bounds satisfied by well balanced trees such as AVL
or red-black trees. Such results were known for static trees [5]; however, recent works gave
indication that strong balance constraints at every node force the working set bound to be
an amortized lower bound, thus forbidding any such tree to have stronger properties such as
the dynamic finger property [4] (the proof was given for self-adjusting skip-lists and B-trees,
however the proofs can easily be adapted to BSTs with balance constraints at every node).
However, it remained open whether relaxing the balance condition to just bounding the height
of the tree would be compatible with obtaining better amortized performances. In [5], a BST
based on Tango trees [11] is engineered to be O(log log n)-competitive and to guarantee
O(log n) worst case access time for each access. However, this structure is unlikely to possess
all the other desirable properties of Splay trees.

Our results. In this article we show that it is possible to automatically transform any BST
algorithm into one that provides worst case time guarantees per access while keeping the same 
asymptotic amortized running times. Our core result shows how to keep a BST balanced while 
losing only a constant factor in the running time: 

• Any BST algorithm A on tree T can be transformed into a BST algorithm A' on a tree T'
whose amortized cost is within a constant factor of the original algorithm, and for which
the depth of T' is always O(log n). If A is online, so is A'.

Using this, we then show how to de-amortize the BST and answer each query in O(logn) worst 
case cost: 

• Any BST algorithm A on tree T can be transformed into a BST algorithm A'' on tree T''
such that for any access sequence S, |A''(S)| = O(|A(S)|) and each access to a node is
performed in O(log n) operations worst case. If A is online, so is A''.

Finally, we show that we can extend the method to maintain the additional features of real- 
world online BST algorithms described above, in a way which turns amortized upper bounds on 
the BST algorithm into worst case performance per access. In particular, we have: 

• Any online BST algorithm A on tree T that performs k accesses in O((n + k) log n)
operations can be transformed into an online BST algorithm A''' on tree T''' such that for
any access sequence S, |A'''(S)| = O(|A(S)|) and each access to a node is performed in
O(log n) operations worst case. If A works in the pointer machine model, with working
space being O(1) words of information in the nodes and O(1) global working variables,
and computes each access to a key in time proportional to the number of BST operations
of the access, then so does A'''.

Applying this transformation to Splay trees, we obtain a BST that executes every sequence
within a constant factor of the Splay tree and thus satisfies the scanning theorem, the working
set property, static optimality, the key-independent optimality, the static finger property, the
dynamic finger property, and that performs each access in O(log n) worst case. By transforming
Multi-Splay Trees, we obtain a BST that is O(log log n) competitive, satisfies the scanning
theorem, the working set property, static optimality, the key-independent optimality, the static
finger property, and performs each search in O(log n) worst case time. Furthermore, if there is a
dynamically optimal BST algorithm, then there is one that additionally performs every search
in O(log n) operations worst case.

Structure of paper. In the next section we show how to implement a stack as a binary
search tree with bounded height. This will be used as a building block in Section 3 to simulate
the operations of any BST using another BST of bounded height. Finally, we show how to use
this rebalanced tree to de-amortize the BST algorithm in Section 4.

2 Pop-Tarts 

We start by implementing a stack using a balanced BST. We differentiate internal nodes, which 
always have two children, and leaves which have no children (leaves can also be seen as empty 
pointers). In order to fit the stack data structure in the BST model, we assume that nodes to be 
pushed onto the stack appear as the parent of the root of the current stack, and that nodes are 
pushed onto the stack in decreasing key order (that is, after the push operation the old stack 
is the right child of the newly inserted node, and its left child is a leaf). Our later application 
of the stack structure fulfils these assumptions. An empty stack is composed of one leaf. The 
structure will maintain the invariant that the left child of the root is always a leaf, to allow 
for easy pop operations. After each push or pop operation, the structure is allowed to perform 
a sequence of operations in the BST model (finger movements and rotations), and at the end 
of the sequence, the finger is back at the root. Leaves can have a weight associated to them, 
and we use the convention that internal nodes all have weight 1 (it would not be difficult to 
generalize these structures to support arbitrary internal weights, however this is not necessary 
for our application). 

A BST implementing a stack in this manner we call a Pop-tart.^1 A pop-tart is good if push
and pop operations are performed in O(1) amortized time and O(log n) worst-case time. It is
crazy good \7T if it is good and the depth of every leaf of weight w is O(log(W/w)), where W
is the total weight of all leaves in the pop-tart, or O(log n) for an unweighted pop-tart with n
leaves.^2

^1 Pop-Tarts are a line of crazy good breakfast products that pop out of the toaster, which remind us of popping
a stack. Pop-tart is a trademark of the Kellogg Company.

In the remainder of this section, we will describe three pop-tart structures. The first two lay 
down ground concepts that will be used to construct the third pop-tart (Chocolate), which is 
always crazy good. 

Vanilla Pop-Tart. Implementing a good pop-tart is easy. In fact, performing no BST
operations after each push or pop operation will produce a linear tree with exactly O(1) time
per operation. This elementary implementation is called Vanilla Pop-Tart. A vanilla pop-tart
will be crazy-good if the weight of each pushed leaf is always at least the total weight of all
other leaves in the pop-tart.

Lemma 1 The Vanilla Pop-Tart is crazy good if nodes are added in decreasing key order and
new leaves have weight larger than or equal to the total weight of all other leaves in the pop-tart. That
is, it uses O(1) time per push and pop operation and the depth of a leaf of weight w is at most
1 + log W/w where W is the total weight of all leaves in the pop-tart.

Proof: The proof is by induction. If the pop-tart contains one leaf, then it is at depth 0; this
covers the base case. Assume by induction that the lemma is true for the right subtree of the
root, which is of total weight W'. Then the left child of the root is the last added leaf and it
has weight at least W', thus W ≥ 2W'. The left child of the root is at depth 1 ≤ 1 + log W/w.
Any other leaf in the tree by induction is at depth at most 2 + log W'/w ≤ 1 + log W/w. □
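
The bound of Lemma 1 can be checked numerically. The following Python sketch is an illustration only: the function name `vanilla_depths` is hypothetical, logarithms are taken base 2, and all weights are assumed to be positive. It uses the fact that in a linear tree obtained after k pushes, the j-th most recently pushed leaf sits at depth j and the leaf of the initially empty stack sits at depth k.

```python
import math


def vanilla_depths(weights):
    """Return (weight, depth, bound) for every leaf of a Vanilla Pop-tart.

    weights[0] is the leaf of the initially empty stack and weights[1:]
    are the leaves pushed afterwards, in push order. The bound is the
    quantity 1 + log2(W / w) from Lemma 1.
    """
    total = weights[0]
    for w in weights[1:]:
        # Precondition of Lemma 1: a new leaf weighs at least as much
        # as all leaves already present.
        if w < total:
            raise ValueError("pushed leaf is lighter than the current pop-tart")
        total += w
    k = len(weights) - 1                      # number of pushes
    rows = []
    for j, w in enumerate(weights):
        depth = k if j == 0 else k - j + 1    # linear tree: no rotations
        rows.append((w, depth, 1 + math.log2(total / w)))
    return rows


if __name__ == "__main__":
    for w, depth, bound in vanilla_depths([1, 1, 2, 4, 8]):
        print(f"weight {w}: depth {depth}, bound {bound:.1f}")
```

With the doubling weights used in the example, every leaf stays within its bound, while a sequence that violates the weight condition is rejected.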

Cherry Pop-Tart. We now describe the Cherry Pop-Tart, which is a crazy good pop-tart
if all leaves have weight 1. Although Cherry Pop-tarts are not used explicitly in this paper, they
serve as a warm up, introducing some key concepts needed to define the Chocolate Pop-tart
structure, which is used later.

The algorithm used is a variant of a 2-4 tree implemented as a BST. On a high level, it 
may be viewed as reversing edges on the leftmost path in a red-black tree, and then having a 
permanent finger at the leftmost internal node (effectively making it the root of the BST). 

In greater detail: The Cherry Pop-tart is a BST with the nodes on the right path of the 
tree grouped into layers. A layer consists of consecutive nodes on the right path, and the left 
subtrees of these nodes are called crumbs. The right child of the last node in the layer is the 
top node of the next layer (except for the last layer, where it is the original leaf of the initial 
empty stack). By definition of BSTs, the layers are linearly ordered, that is, all keys in a layer 
are smaller than the keys in the next layer. 

We number the layers as follows: the layer containing the root is layer 0, the next one along
the right path is layer 1, and so on. We maintain the invariants that each layer has between 1
and 3 nodes on the right path (hence that many crumbs), and that the crumbs pointed to by
layer i (called i-crumbs) are perfectly balanced trees containing exactly 2^i leaves. See Figure 1.

The invariant is true for a pop-tart containing one node: that node is layer 0 and it points
to one 0-crumb (containing one leaf). When a new node is pushed as the parent of the root,
it is added to layer 0. Layer 0 therefore has one more node and one more 0-crumb. Either the
new layer still has no more than 3 crumbs, maintaining the invariant, or layer 0 now has 4
0-crumbs (each composed of exactly one leaf). In this case, we perform a left rotation between
the last two nodes of the layer. This replaces the last two nodes of the layer with one node
whose left pointer points to a 1-crumb. We now move that node from layer 0 to layer 1. See
Figure 2. Again, the reconfiguration could either stop there or ripple down further. In general,

^2 We slightly abuse the big-Oh notation and write O(log(W/w)) to mean a function which is smaller than
c log(W/w) + d for some constants c and d.
Figure 1: Layers and crumbs of a Cherry Pop-tart.




Figure 2: Restoring the Cherry Pop-tart invariant at level i. 



as a node is added as the parent of the first node in layer i, either layer i still has no more
than 3 i-crumbs, or we perform a rotation on the node between the last two crumbs, forming an
(i+1)-crumb with twice as many leaves which is inserted into layer i + 1. A pop operation works
symmetrically, by removing the first node of layer 0 (whose left child is a leaf) and restoring
the invariant, that is, if layer 0 contains no more nodes, we perform a right rotation on the first
node of layer 1, transforming it into two nodes that are moved into layer 0. If layer 1 is now
empty, we repeat the operation on the first node of layer 2 and so on.
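
The cascading behaviour of push and pop can be sketched by tracking only the number of crumbs per layer, much like a redundant binary counter. The Python sketch below is an abstraction made for illustration, not the tree structure itself: `layers[i]` is assumed to hold the number of i-crumbs in layer i, and the names `push`, `pop`, and `leaves` are hypothetical. Each function returns the number of layers it touched, which is proportional to the cost of the operation.

```python
def push(layers):
    """Add a node to layer 0 and carry overflowing layers downward."""
    if not layers:
        layers.append(0)
    layers[0] += 1
    i, touched = 0, 1
    while layers[i] == 4:
        layers[i] = 2              # rotation merges the last two i-crumbs ...
        if i + 1 == len(layers):
            layers.append(0)
        layers[i + 1] += 1         # ... into one (i+1)-crumb of the next layer
        i += 1
        touched += 1
    return touched


def pop(layers):
    """Remove the first node of layer 0 and borrow from deeper layers."""
    if not layers:
        raise IndexError("pop from an empty pop-tart")
    layers[0] -= 1
    i, touched = 0, 1
    while layers[i] == 0 and i + 1 < len(layers):
        layers[i] = 2              # rotation splits one (i+1)-crumb ...
        layers[i + 1] -= 1         # ... into two i-crumbs of layer i
        i += 1
        touched += 1
    if layers[i] == 0:             # the last layer ran empty: drop it
        layers.pop()
    return touched


def leaves(layers):
    """Total number of leaves stored in crumbs: layer i holds 2**i per crumb."""
    return sum(count << i for i, count in enumerate(layers))


if __name__ == "__main__":
    layers, cost = [], 0
    for _ in range(10):
        cost += push(layers)
    print(layers, leaves(layers), cost)
```

In this abstraction every layer keeps between 1 and 3 crumbs after each operation, and the total number of touched layers over a sequence of operations stays within twice the number of operations, in line with the potential argument of Lemma 2.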

Lemma 2 The Cherry Pop-Tart is crazy good if nodes are added in decreasing key order and
all leaves have weight 1. That is, it uses O(1) amortized time and O(log n) worst case time per
push and pop operation and its tree has height O(log n).

Proof: To show that a push or pop operation has amortized cost O(1), we assign a potential
of 0 to layers with 2 nodes, and a potential of 1 to layers with 1 or 3 nodes. A push or pop
operation has actual cost proportional to the number of layers that had to be readjusted to
restore the invariant. Each readjusted layer had a potential of 1 before the operation (i.e., had
3 nodes before a push or 1 node before a pop) and of 0 after the operation (i.e., has 2 nodes
exactly). Therefore, the decrease of potential pays exactly for the readjustments. The insertion
or deletion in the last layer possibly increases its potential by 1, which is the amortized cost of
the operation. Therefore, this pop-tart is good.

Since layer i has at least one i-crumb containing 2^i leaves, a pop-tart with n leaves has at
most log n layers, each having crumbs of height O(log n), thus the total height of the tree is
O(log n). So in the unweighted case, this pop-tart is crazy-good. □

The next lemma shows that the Cherry Pop-tart is crazy-good even in some weighted cases. 

Lemma 3 The Cherry Pop-Tart is crazy good if nodes are added in decreasing key order and
new leaves are added with increasing weights. That is, it uses O(1) amortized time per push and
pop operation and the depth of a leaf of weight w is O(log W/w) where W is the total weight of
all leaves in the pop-tart.



Figure 3: Level i in the Chocolate Pop-tart.

Proof: We use the exact same structure as in the previous lemma. Observe that by the
conditions in the lemma, an inorder traversal of the tree will meet the leaves in order of decreasing
weight. Since i-crumbs contain 2^i leaves, the layer containing the k-th heaviest leaf in the pop-tart
has index at most log k, hence has crumbs of depth at most log k. So the depth of the k-th heaviest
leaf is at most 4 log k. If the k-th heaviest leaf is of weight w, then the total weight W of all
leaves in the pop-tart is at least kw, hence the depth of that leaf is at most 4 log k ≤ 4 log W/w.
Thus, the pop-tart is crazy good. □

In order to allow for arbitrary weight order, we will have to modify slightly the data structure. 
We call the next structure the Chocolate Pop-Tart. 

Chocolate Pop-Tart. Again, the structure will be decomposed into a sequence of layers
whose nodes form a right path and point to crumbs. This time, the right path of the i-th layer
will be composed of 1 to 3 regular nodes whose left child is an i-crumb, then a next node whose
left child points to the next layer and whose right child points to a subtree called the icing. This
will be called the structural invariant. See Figure 3. The icing is itself a stack, implemented
using a Vanilla Pop-tart (that is, a simple linear tree), whose leaves will be frozen^3 subtrees of
the chocolate pop-tart. In order for the icing to be crazy-good, we will ensure that the nodes
(frosted subtrees) pushed onto it will always be at least as heavy as the total weight of the icing.
The subtrees to be frosted and pushed into the icing of level i will always be the next node and
the entire subtree rooted at the top node of level i + 1. Therefore, we maintain the invariant
that the total weight of layer i + 1 (that is, the total weight of the subtree rooted at the
topmost node of that layer) is smaller than the total weight of the icing of layer i (thick icing
invariant). If violated, layer i + 1 will be frosted and pushed into the icing, to maintain the
invariant.

The last layer, say, layer i, is incomplete: it is composed of 0 to 3 regular nodes, has no
pointer to the next layer, and always contains an icing as its rightmost subtree. It can only have
0 regular nodes if the icing contains exactly one element (which is always an i-crumb).

As before, when a new node is pushed onto the i-th layer (starting with i = 0), either the i-th layer
has at most 3 regular nodes, in which case we are done, or it contains 4 regular nodes and we
need to restore the structural invariant. We start by performing a left rotation between the two
lowest regular nodes in the layer, creating an (i+1)-crumb. We have two cases to consider. If
the i-th layer is not the last one, then we perform a left rotation between the next node and the
lowest regular node, to move the new (i+1)-crumb and its node to the (i+1)-th layer. On the
other hand, if the i-th layer is the last one, then it has no next node. Then the lowest regular
node becomes a next node which points to the new (i+1)-th layer. That (i+1)-th layer contains
0 regular nodes, no next node and an icing which contains the (i+1)-crumb as its only leaf.

Having done this, there are again two cases to consider: if the total weight of the subtree
rooted at the (new) top node of the (i+1)-th layer is smaller than the total weight of the icing of
the layer, then we proceed with the insertion of the (i+1)-crumb, by restoring the structural
invariant if necessary, and so on. Otherwise, we restore the thick icing invariant by frosting the
(i+1)-th layer without modifying it further (even if it now contains 4 regular nodes), and push
it and its parent node (the next node of the i-th layer) into the icing of the i-th layer. The i-th
layer then becomes the last layer. It has no next node and two regular nodes.

^3 or frosted

The deletion operation is symmetric: when the first regular node of the i-th layer is deleted,
either the layer still has at least one regular node left, in which case we are done, or we have to
restore the structural invariant. If i is not the last layer, we pull two nodes and their associated
i-crumbs from the (i+1)-th layer (by performing two right rotations and possibly recursively
restoring the invariant in the (i+1)-th layer). If the (i+1)-th layer is only composed of an icing
(which then contains one frosted (i+1)-crumb), we defrost the icing, perform a right rotation,
transforming the next layer into two regular nodes pointing to i-crumbs and the i-th layer becomes
the last one. On the other hand, if i is the last layer, then we pop a frosted subtree from the
icing (unless it contains only one leaf), and perform a right rotation to turn the frosted subtree
into one regular node and a next node, the latter pointing to the new, unfrosted, (i+1)-th layer
and to the remaining icing.

Lemma 4 The Chocolate Pop-Tart is crazy good if nodes are added in decreasing key order and
new leaves are added with arbitrary weights. That is, it uses O(1) amortized time per push and
pop operation and the depth of a leaf of weight w is O(log W/w) where W is the total weight of
all leaves in the pop-tart.

Proof: We first show that the Chocolate Pop-tart is good, that is, it uses O(1) amortized time
per push and pop operation. For this, we assign a potential of 0 to layers with 2 regular nodes,
and a potential of 1 to all other layers. A push operation will cause a series of reconfigurations
in successive layers, which end in either adding a crumb to a layer that does not overflow, or
pushing an element in the icing of a layer. Either case costs O(1) amortized. As in the case of
Cherry Pop-tarts, it is easily verified that every layer that overflows had 3 regular nodes before,
and thus a potential of 1, and two regular nodes after, so a potential of 0 (except possibly for
the last rearranged layer). Likewise, during a pop operation, the potential of a rearranged layer
(except the last one) goes from 1 to 0 since the number of regular nodes it contains goes from 1
to 2. Thus, the decrease of potential of a layer during a push or a pop pays for its rearrangement,
while the amortized cost of O(1) pays for the potential increase and the rearrangement in the
last node and the push in the icing if it occurs.

It now remains to prove that the depth of a node of weight w is O(log W/w). The proof will
be by induction on the layer number. Consider the subtree rooted at the first node of the i-th
layer and let W_i be the total weight of that subtree. Assume by induction that at any moment
in the algorithm, any leaf of weight w has depth at most i + 6 + 7 log W_i/w starting from the root of the
i-th layer. We want to show that in the subtree rooted at the first node of the (i-1)-th layer, any
leaf of weight w has depth at most (i-1) + 6 + 7 log W_{i-1}/w. Obviously, the hypothesis is true for an
i-th layer that contains only an icing with one frosted i-crumb, since all its leaves are at distance
i; this covers the base case.

For an (i-1)-th layer, we consider the leaves located (i) in (i-1)-crumbs pointed to by regular
nodes, (ii) in the i-th layer if it exists, and (iii) in the icing of the (i-1)-th layer. Any leaf of
type (i) is at distance ≤ 3 + i - 1, which is small enough. For type (ii) leaves, notice that as long
as i-crumbs are being moved from the (i-1)-th layer to the i-th layer without being frosted and
pushed to the icing, W_i ≤ W_{i-1}/2. Therefore, for any leaf of weight w in the subtree of the i-th
layer, the depth of that leaf is at most

\[ 4 + i + 6 + 7\log(W_i/w) \le 10 + i + 7\log(W_{i-1}/w) - 7 \le (i-1) + 6 + 7\log(W_{i-1}/w), \]

which is below the desired bound. 

Finally for case (iii), since the icing of the (i-1)-th layer is implemented as a Vanilla pop-tart
and the frosted subtrees are pushed with (total) weights always larger than all other leaves
(frosted subtrees) in the icing, the icing is crazy good, that is, a frosted subtree of total weight
W will have its root at depth at most 5 + log W_{i-1}/W. Let p be the parent of the frosted subtree
containing the node of weight w, and let W_p be the weight of the subtree rooted at p. The depth
of p is at most 4 + log W_{i-1}/W_p since the left child of every node on the right path of the icing
contains at least half of the weight of that node. Every frosted subtree has its first node whose
left pointer points to a possibly heavy i-crumb, and whose right pointer points to what used to
be the i-th layer at some point in time. Let W' be the weight of that i-th layer. Then W' ≤ W_p/2,
otherwise the i-th layer would have been frosted earlier. By induction, a leaf of weight w in this
former i-th layer must have depth no more than

\[
\begin{aligned}
4 + \log(W_{i-1}/W_p) + 2 + i + 6 + 7\log(W'/w)
&\le 12 + i + \log(W_{i-1}/W_p) + 7\log(W_p/w) - 7 \\
&\le (i-1) + 6 + 7\log(W_{i-1}/w)
\end{aligned}
\]

which is the desired bound. A leaf in the i-crumb pointed to by the left pointer of the root node
of the frosted subtree has weight at most W_p, and its depth is

\[ 4 + \log(W_{i-1}/W_p) + 2 + i \le (i-1) + 6 + 7\log(W_{i-1}/w). \]

This completes the induction proof. For i = 0, we have that any leaf of weight w has depth at
most 6 + 7 log W/w, so the chocolate pop-tart is crazy-good for arbitrary weights. □

Note that all pop-tarts described in this section can also be flipped to maintain elements
pushed in increasing order. If the cherry or chocolate pop-tarts need to be implemented in a
real-world BST, O(1) extra bits of information in each node are sufficient for storing the function
of that node (regular, next, icing, crumb).

3 Simulation 

We now show how to efficiently simulate any BST algorithm while keeping the tree of logarithmic
height. The method will work for trees with weighted nodes as well. Let w_i be the weight of
the node with key i and let W = Σ_i w_i. For unweighted trees, set w_i = 1 and W = n.

We represent the tree T of the original BST algorithm using a heavy path decomposition. To 
construct this decomposition, we denote every edge of T as either solid or dotted. For each 
non-leaf node, the edge to its child with largest total subtree weight (or the left child, in case of 
a tie) is a solid edge, and the edge to its other child is dotted. The solid edges form heavy paths 
connected together by dotted edges. 
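
The decomposition can be computed in a single pass over the tree. The Python sketch below is one possible illustration under stated assumptions: the tree is given as a hypothetical dictionary `children` mapping a key to its `(left, right)` pair (with `None` for a missing child), `weight` maps keys to node weights, and the function names are invented for this example.

```python
def heavy_edges(children, weight, root):
    """Return (solid, total) for the tree rooted at root.

    solid[v] is the child of v reached by the solid edge (None for a
    node without children); total[v] is the total weight of v's subtree.
    """
    total = {}

    def subtree(v):
        if v is None:
            return 0
        left, right = children.get(v, (None, None))
        total[v] = weight[v] + subtree(left) + subtree(right)
        return total[v]

    subtree(root)
    solid = {}
    for v in total:
        left, right = children.get(v, (None, None))
        if left is None and right is None:
            solid[v] = None
        elif right is None or (left is not None and total[left] >= total[right]):
            solid[v] = left          # heavier child, or the left child on a tie
        else:
            solid[v] = right
    return solid, total


def dotted_on_path(children, solid, root, x):
    """Count the dotted edges on the search path from root to key x."""
    count, v = 0, root
    while v != x:
        left, right = children.get(v, (None, None))
        nxt = left if x < v else right
        if nxt != solid[v]:
            count += 1
        v = nxt
    return count


if __name__ == "__main__":
    children = {4: (2, 6), 2: (1, 3), 6: (5, 7)}
    weight = {k: 1 for k in range(1, 8)}
    solid, total = heavy_edges(children, weight, 4)
    print(solid, dotted_on_path(children, solid, 4, 7))
```

On the balanced unit-weight tree over keys 1..7, ties are broken toward the left child, so the heavy path from the root is 4, 2, 1, and the path to key 7 crosses two dotted edges.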

We simulate the original BST algorithm as follows: When its finger is at the root of T, each
heavy path is implemented using a pair of weighted pop-tarts: a heavy path from node y to node
x (with y an ancestor of x) is a sequence of nodes that can be decomposed into the subsequence
L(y,x) of nodes smaller than x on the path, and the subsequence R(y,x) of nodes larger than
x on the path. Note that L(y,x) is increasing, and R(y,x) is decreasing. In our simulation, the
end of the path x does not change, but y can move up or down along the path to the root. As
y moves up, the new nodes are added to L(y,x) in decreasing order, or to R(y,x) in increasing
order.
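
To make the two subsequences concrete, the short Python sketch below splits the keys of a path into L(y,x) and R(y,x). The function name `split_path` and the list representation of the path (keys listed from y down to x) are assumptions made for this example.

```python
def split_path(path):
    """Split a path, given as keys from y (top) down to x (bottom).

    Returns (L, R): the keys on the path smaller than x and the keys
    larger than x, each listed from top to bottom.
    """
    x = path[-1]
    smaller = [k for k in path[:-1] if k < x]
    larger = [k for k in path[:-1] if k > x]
    return smaller, larger


if __name__ == "__main__":
    # The path 8 -> 3 -> 6 -> 4 -> 5 is a valid search path for key 5.
    print(split_path([8, 3, 6, 4, 5]))
```

For the search path 8, 3, 6, 4, 5 the smaller keys appear in increasing order and the larger keys in decreasing order from top to bottom, so extending the path upward adds keys at the small end of L and at the large end of R.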

The sequences L(y,x) and R(y,x) will each be stored in the weighted chocolate pop-tart
structure described in the previous section, and these two pop-tarts will be left and right children
of x, respectively, see Fig. 4. Each node on the path is connected via a dotted edge to a subtree
which will be considered as a leaf in the pop-tart, whose weight is exactly the total weight of
all the nodes in that subtree. The subtrees contained in those leaves will be structured in the
same manner, recursively. The nodes in the tree will contain two extra bits, one to determine if
the edge to its parent node is solid or dotted, and another to determine if the next node on its
heavy path is in L(y,x) or R(y,x).

When the finger f is not at the root r of the tree, the path from the finger to the 
root is also represented as a pair of pop-tarts in a similar way, but this time upside-down (see 
Fig. 5). Thus, as f walks down, the elements of L(r,f) are added in increasing order, and the 
elements of R(r,f) are added in decreasing order. Hence, finger movements in the original BST 
algorithm can be implemented using one push and one pop operation by transferring a node 
from one pop-tart to the other using O(1) rotations. Likewise, rotations in the original BST 
algorithm only involve the first few nodes on the pop-tarts linked from the finger, and thus can 
be implemented in O(1) rotations and push/pop operations. Note that the finger in the tree 
maintained by our simulation always stays at the root. 

Figure 5: Representing the finger in general position. 
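
The push/pop behaviour of the finger path can be mimicked with two plain Python lists standing in for the pop-tarts. This is only a sketch of the bookkeeping: the class `FingerPath` and its `side` list (one bit per ancestor, recording which sequence holds it) are illustrative names, and the sketch ignores weights, rotations and the pop-tart structure itself.

```python
class FingerPath:
    """Root-to-finger path split into L (smaller keys) and R (larger keys)."""

    def __init__(self, root_key):
        self.finger = root_key
        self.L, self.R = [], []  # ancestors smaller / larger than the finger
        self.side = []           # one bit per ancestor: which stack holds it

    def down(self, child_key):
        # The old finger becomes an ancestor; it is smaller than the new
        # finger when we step right, and larger when we step left.
        stack, bit = (self.L, "L") if self.finger < child_key else (self.R, "R")
        stack.append(self.finger)
        self.side.append(bit)
        self.finger = child_key

    def up(self):
        # The stored bit tells us which stack holds the parent of the finger.
        stack = self.L if self.side.pop() == "L" else self.R
        self.finger = stack.pop()


fp = FingerPath(8)
for child in (4, 6, 5):  # walk down the path 8 -> 4 -> 6 -> 5
    fp.down(child)
```

After the walk, `fp.L` is `[4]` and `fp.R` is `[8, 6]`: keys enter L in increasing order and R in decreasing order, as stated above, and each finger move touches only the top of one stack.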

Any path from the root to a node x of weight w uses at most log(W/w) dotted edges. Further, 
let W_1, W_2, ..., W_k be the total weights of the successive heavy paths (along with their 
descendants) on the path from the root to x. By Lemma 4, the i-th heavy path will be stored 
at depth O(log(W_{i-1}/W_i)) in the pop-tart of the (i-1)-th heavy path, and node x will be at 
depth O(log(W_k/w)) in the pop-tart of the last heavy path. Thus, the total depth of x in the 
tree is bounded by a telescoping sum that sums up to O(log(W/w)). Clearly, if A is online, so 
is A'. We obtain: 

Theorem 5 Given a BST algorithm A with a starting tree T, there is a BST algorithm A' with 
a starting tree T' such that |A'(S)| = O(|A(S)|), and such that the depth of a node i in T' is 
always O(log(W/w_i)) and the finger is always at the root of T'. If A is online, so is A'. 
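
The telescoping sum behind the depth bound can be written out as follows. This is a sketch under two assumptions that are not spelled out in the text: that Lemma 4 bounds each depth by c(log(ratio) + 1) for some constant c, and that W_1 = W, since the first heavy path contains the root. Logarithms are base 2, and k - 1 is the number of dotted edges on the path, so k ≤ 1 + log(W/w).

```latex
\begin{align*}
\operatorname{depth}(x)
  &\le \sum_{i=2}^{k} c\left(\log\frac{W_{i-1}}{W_i} + 1\right)
       + c\left(\log\frac{W_k}{w} + 1\right) \\
  &= c\,\log\frac{W_1}{w} + c\,k \\
  &\le 2c\,\log\frac{W}{w} + c .
\end{align*}
```

Up to the additive constant, this is the O(log(W/w)) bound stated in Theorem 5.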

If the simulation needs to be implemented in a real-world BST, O(1) extra bits per node 
are sufficient for storing the structure of the original tree and the function of each node in the 
simulation: each node needs to indicate whether each of its children is part of the same heavy 
path or not, and for all nodes on the path from r to f, a bit must be stored to determine if 
the next node on the path is stored in L(r,f) or in R(r,f). Note that if T is unbalanced, it is 
necessary to restructure it into T' in order to obtain the depth bound. However, if the starting 
tree T already has this property, we can start with T unchanged and restructure the 
tree during the execution of the algorithm every time the finger enters a yet unexplored subtree. 
4 De-amortization 



We are now ready to show how to de-amortize BST algorithms. 

Theorem 6 For any BST algorithm A with a starting tree T there is a BST algorithm A'' with 
a starting tree T'' such that for any access sequence S, |A''(S)| = O(|A(S)|) and each access to 
a node is performed in O(log n) operations worst case. If A is online, so is A''. 

Proof: Using Theorem 5, transform A and T into A' and T' such that the depth of node 
i in T' is always at most c log n for some constant c. Algorithm A' is then modified in the 
following way: while running the sequence of operations in A'(S), every time c log n operations 
from the original A'(S) sequence have been performed without accessing the next unaccessed 
element of the input sequence, access this element by moving the finger to it and back (thereby 
inserting at most 2c log n extra BST operations into the sequence at this point). Thus every 
access is performed in at most 3c log n operations in the worst case, and the total cost of the 
sequence is the same within a factor 3. If A is online, so is A''. □ 
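
The accounting in this proof can be checked on a toy operation stream. The Python sketch below is an abstraction, not the algorithm itself: `costs[j]` is assumed to be the number of A' operations (at least 1) between the moment A' reaches s_{j-1} and the moment it reaches s_j, each forced access is charged its full upper bound of 2c log n operations, and the function name `deamortize` is hypothetical.

```python
import math


def deamortize(costs, n, c=1):
    """Return (per-access latencies, total cost) of the modified algorithm.

    costs[j] >= 1 is the number of A' operations up to and including the
    one at which A' itself reaches the j-th element of the sequence.
    """
    budget = max(1, math.ceil(c * math.log2(n)))  # stands in for c log n
    latencies, total = [], 0
    nxt = 0    # index of the next element not yet accessed
    since = 0  # A' operations since the last completed access
    for j, cost in enumerate(costs):
        for step in range(1, cost + 1):
            total += 1
            since += 1
            if step == cost and j == nxt:
                # A' reaches the next unaccessed element within the budget.
                latencies.append(since)
                nxt, since = nxt + 1, 0
            elif since == budget and nxt < len(costs):
                # Budget exhausted: walk the finger to the element and back.
                total += 2 * budget
                latencies.append(since + 2 * budget)
                nxt, since = nxt + 1, 0
    return latencies, total


latencies, total = deamortize([1, 1, 50, 1, 1], n=16, c=1)
```

With n = 16 and c = 1 the budget is 4 operations. The expensive third access triggers three forced accesses, so no latency exceeds 3 · 4 = 12 and the total cost stays within a factor 3 of the 54 operations of A'.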

Again, it is usually necessary to transform the starting tree in order to achieve an O(log n) 
worst case bound per access, for example in the case when T is very unbalanced. If the starting 
tree T has height O(log n), however, it is not necessary to modify it and we can, as described 
above, restructure the tree during the execution of the algorithm every time the finger enters a 
yet unexplored subtree. 

As mentioned in the introduction, real-world BST algorithms normally work in the pointer 
machine model, with working space for the algorithm being O(1) words of information in the 
nodes of the tree, and O(1) global working variables. Additionally, they can be implemented to 
find their BST operations for a key s_i in time proportional to the number of these operations 
(i.e., in time proportional to the cost in the BST model). 

A natural goal is that our de-amortized output algorithm should adhere to these constraints 
if the input algorithm does. We now describe how to do this, given bounds on the amortized 
behaviour of the input BST algorithm. In particular, we show how any real-world online BST 
algorithm with O(log n) amortized time bounds (such as Splay Trees) can be transformed 
into an online BST algorithm with O(log n) worst case time bounds, while not changing its 
running time on any sequence by more than a constant factor. In the following theorem, the 
formulation is slightly more general. 

Theorem 7 Let f(n) be a function in Ω(log n). For any online BST algorithm A that for 
some starting tree T is guaranteed to perform k accesses in O(n f(n) + k f(n)) operations, there 
is an online BST algorithm A''' and a starting tree T''' such that |A'''(S)| = O(|A(S)|) for 
any access sequence S, and such that A''' performs each access to a node in O(f(n)) operations 
worst case. If A works in the pointer machine model, with working space being O(1) words of 
information in the nodes and O(1) global working variables, and computes each access to a key 
in time proportional to the number of BST operations of the access, then so does A'''. 

Proof: The general idea is the same as in Theorem 6, except that log n is replaced by f(n). The 
main problem to overcome is that for some s_i's, the number of BST operations of A' may be 
larger than f(n), due to the amortization in A (and the amortization added when transforming 
it into A'). These BST operations cannot all be executed before the access to s_i has to be 
finished by a traversal of the balanced tree of A' and s_{i+1} has to be served next by the online 
algorithm A'''. In short, in the execution of A', its point in the input sequence can lag behind 
that of A''', and the problem is how A''' can efficiently keep track of what operations to do next 
when executing A'. 

We do this by maintaining a queue Q containing the keys whose accesses have already been 
performed in A''' but whose BST operations in the execution of A' still have to be done. Thus, 
Q always contains a (possibly empty) suffix of the keys s_1, s_2, ..., s_i, where s_i is the key last 
accessed by A'''. The A' operations of the oldest key in Q may be partly executed, and we store 
the state of the process of A' on that key in the global variables of A''' (we assume such a state 
can be stored in O(1) words for A, which implies that it also can be done for A'). 

To adhere to our notion of practical BST algorithms, we implement the queue by a linked 
list of queue nodes, with each node of the list stored in a node of the tree as a pair of words. The 
first word is the key stored at that position in the queue and the second word is a list pointer to 
the next queue node, represented by the key of the tree node containing that queue node. Note 
that following a list pointer may require walking O(log n) steps in the tree, and so each enqueue 
or dequeue operation will cost that much. 
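
A minimal Python sketch of such a queue is given below. Everything beyond the two words per tree node is an assumption made for illustration: the class name `TreeQueue`, the free list threaded through unused cells, the three global variables `head`, `tail` and `free`, and the flat charge of ceil(log2 n) steps for every pointer that is followed.

```python
import math


class TreeQueue:
    """FIFO queue whose cells live inside tree nodes; pointers are node keys."""

    def __init__(self, keys):
        keys = sorted(keys)
        self.hop_cost = max(1, math.ceil(math.log2(len(keys))))
        # cell[k] = [queued_key, next_cell_key]: the two words kept in node k.
        self.cell = {k: [None, nxt] for k, nxt in zip(keys, keys[1:] + [None])}
        self.free = keys[0]  # assumption: unused cells form a free list
        self.head = self.tail = None
        self.steps = 0       # tree steps spent following pointers

    def _goto(self, k):
        # Following a pointer means searching for key k in the balanced tree.
        self.steps += self.hop_cost
        return self.cell[k]

    def enqueue(self, key):
        if self.free is None:
            raise OverflowError("the queue has room for only n elements")
        k = self.free
        cell = self._goto(k)
        self.free = cell[1]
        cell[0], cell[1] = key, None
        if self.tail is None:
            self.head = k
        else:
            self._goto(self.tail)[1] = k
        self.tail = k

    def dequeue(self):
        if self.head is None:
            raise IndexError("dequeue from an empty queue")
        k = self.head
        cell = self._goto(k)
        key, self.head = cell[0], cell[1]
        if self.head is None:
            self.tail = None
        cell[0], cell[1] = None, self.free  # return the cell to the free list
        self.free = k
        return key


q = TreeQueue(range(1, 9))
q.enqueue(5)
q.enqueue(3)
first, second = q.dequeue(), q.dequeue()
```

Each operation follows at most two pointers, so under the stated cost assumption it takes O(log n) tree steps, and the same key may be queued more than once because cells are identified by the tree node that hosts them rather than by the queued key.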

We now give the details of A'''. It uses the following three basic routines. 

A: Restart the A' process of the last key in Q, and then perform A' process work on this and 
the following keys in Q, doing a dequeue each time the process for the last key finishes. The 
routine ends when A''' has done d · f(n) BST operations, or Q runs empty. Here, d is a constant 
to be determined later. 

B: Perform A' BST operations on the newest key of A'''. The routine ends when f(n) such 
operations have been performed, or the operations have all been done. 

C: Access the newest key of A''' by a search in the tree maintained. Enqueue the key in Q. 

Given these routines, the actions of A'" on the next input key are: 

    IF Q is not empty: 
        do A 
    IF Q is [now] empty: 
        do B 
        IF routine B ended by all operations being done: exit [skipping C below] 
    do C 

This takes O(f(n)) time worst case, since each of the routines does. We now want to argue 
that |A'''(S)| = Θ(|A'(S)|) on any sequence S, by charging all work of A''' to work of A' that has 
been executed. There are five different types of actions of A''' possible, with routine sequences 
as follows: AB, ABC, AC, B, and BC. 
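
That exactly these five routine sequences can occur follows from enumerating the outcomes of the control flow. The Python sketch below does so; the three boolean flags and the function name `action` are illustrative abstractions of the conditions tested by A''', not part of the original algorithm.

```python
from itertools import product


def action(q_nonempty: bool, a_empties_q: bool, b_finishes: bool) -> str:
    """Return the routine sequence executed by A''' for one input key."""
    trace = ""
    if q_nonempty:
        trace += "A"
        q_nonempty = not a_empties_q  # routine A may run the queue empty
    if not q_nonempty:
        trace += "B"
        if b_finishes:
            return trace              # all operations done: skip C
    return trace + "C"


kinds = sorted({action(*flags) for flags in product([False, True], repeat=3)})
```

Here `kinds` evaluates to `['AB', 'ABC', 'AC', 'B', 'BC']`, matching the list of action types.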

The action B starts and ends with Q empty, and takes time proportional to the work done 
on A', hence that work can be charged. The action BC starts with Q empty, ends with Q 
non-empty, takes time O(f(n)), and does f(n) work on A', hence that work can be charged. The 
action ABC starts with Q non-empty, ends with Q non-empty, has Q empty in the meantime, 
takes time O(f(n)) and does f(n) work on A', hence that work can be charged. The action AB 
starts with Q non-empty, ends with Q empty, takes time O(f(n)) but does possibly only O(1) 
work on A'. However, the action (either BC or ABC) during which Q last turned from empty 
to non-empty did f(n) work on A', hence that work can be charged. 

Remaining is the action AC, which has Q non-empty from start to end. Note that in the 
A part, the d · f(n) BST operations of A''' for f(n) = log n can be a couple of dequeues (each 
taking log n operations), all of keys for which there is only a constant amount of A' work. Hence, 
we cannot charge the A' work on a per-action basis, and a more elaborate charging argument is 
needed: Consider a sequence of t AC actions, following an action (either BC or ABC) during 
which Q last turned from empty to non-empty. There have been exactly t + 1 elements enqueued 
(during C parts) since the queue was last empty, of which (at least) the last is still present. 
Hence, during the t AC actions at most t dequeues can have been done. Also, exactly t restarts, 
t key accesses via the balanced tree, and t enqueues have been done. All these sum up to at 
most ct · log n BST operations for A''', for some constant c. Let c' be a constant such that 
f(n) ≥ c' · log n. At least dt · f(n) ≥ dc't · log n BST operations have been done by A''' during 
the t actions, and those not included in the sum above must be A' process work. Thus, by 
choosing d large enough that dc' ≥ 2c, at least half of the work done must be A' process work. 
Hence, we can charge all A''' work to A' work executed during the t actions. 
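
The arithmetic of this charging step can be spelled out as follows, where T is a name introduced here for the total number of BST operations done by A''' during the t actions (so T ≥ dt · f(n)) and "overhead" denotes the dequeues, restarts, key accesses and enqueues counted above.

```latex
\begin{align*}
\text{overhead} &\le c\,t\log n
   \le \frac{c}{c'}\,t\,f(n)
   \le \frac{d}{2}\,t\,f(n)
   \le \frac{T}{2}, \\
\text{$A'$ process work} &= T - \text{overhead} \ge \frac{T}{2}.
\end{align*}
```

The second inequality uses f(n) ≥ c' log n, and the third uses the choice dc' ≥ 2c.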

Summing up, over the entire sequence, all work of A''' can be charged to executed work of 
A', with no work of A' being charged more than a constant number of times. Hence, |A'''(S)| = 
O(|A'(S)|). By |A'(S)| = O(|A(S)|) from Theorem 5, the claim |A'''(S)| = O(|A(S)|) of 
Theorem 7 follows. 
Finally, as the queue has only room for n elements, we need to guarantee that the queue 
does not overflow. The queue overflows exactly when n AC actions in a row have taken place. 
By the argument above, at least (1/2) · dn · f(n) A' work has been executed, which for d large 
enough leads to a contradiction with the guarantee on A's performance. 

□ 

We note that one feature of BST algorithms not maintained by Theorem 7 is the exact 
amount of information stored in tree nodes, besides the search key. For classical BST algorithms, 
this varies from zero bits in Splay trees, over one bit in red-black trees, two bits in AVL-trees, to 
Θ(log n) bits in weight-balanced trees and treaps. Algorithm A''' from Theorem 7 always uses 
Θ(log n) bits. 